MicroRNAs (miRNAs) constitute a recently discovered class 
of noncoding RNAs that play key roles in the regulation of gene 
expression. Despite being only ~20 nucleotides in length, these 
highly versatile molecules have been shown to play pivotal 
roles in development, basic cellular metabolism, apoptosis, 
and disease. While over 24,000 miRNAs have been character- 
ized since they were first isolated in mammals in 2001, the 
functions of the majority of these miRNAs remain largely unde- 
scribed. That said, many now suggest that characterization of 
the relationships between miRNAs and transposable elements 
(TEs) can help elucidate miRNA functionality. Strikingly, over 
20 publications have now reported the initial formation of 
thousands of miRNA loci from TE sequences. In this review we 
chronicle the findings of these reports, discuss the evolution 
of the field along with future directions, and examine how this 
information can be used to ascertain insights into miRNA tran- 
scriptional regulation and how it can be exploited to facilitate 
miRNA target prediction. 



Introduction 

Functional roles for microRNAs have now been described 
in virtually every basic biological process including (in part): 
control of the cell cycle, the regulation of apoptosis, insulin
production, lipid metabolism, hypoxia response, immune regulation
and viral defense. 1 Furthermore, patterns of miRNA expression
are highly regulated both spatially and temporally during
embryonic development, which suggests that these molecules 
play key roles in cell fate determination and the differentiation 
and maintenance of tissue identity. 2 Also of note, differential 
expressions of miRNAs have been found to be characteristically 
associated with a number of pathologies (e.g., various types of 
cancer, cardiovascular disease and neurological disorders) with 
some of these altered miRNA expressions actually playing causal 
roles in particular malignancies. 3 As a result, miRNAs are
progressively becoming the focus of considerable research as the
potential for their use as diagnostic and prognostic biomarkers,
therapeutic targets, and regulators of basic cellular metabolism
continues to advance.

As we propose that determining the genomic events initially 
giving rise to miRNAs can provide novel insight into their indi- 
vidual functions and regulations, this review will focus spe- 
cifically on the relationship between miRNAs and TEs and 
summarize the mounting body of evidence suggesting that the 
majority of miRNAs were initially formed from TE sequences. 
Also of note, although beyond the scope of this review, in addi- 
tion to the relationship between TEs and miRNAs, numerous 
other functional relationships between various TEs and non- 
coding RNAs have now been documented. This suggests that 
integral sequence-based relationships between TEs and noncoding
RNAs (ncRNAs) are perhaps more prevalent than initially
appreciated (e.g., Alu-mediated turnover of long non-coding
RNAs (lncRNAs), long-terminal repeat (LTR) TEs providing
regulatory sequences for long intergenic non-coding RNAs (lincRNAs),
and the formation of other types of short noncoding
RNAs (endogenous siRNAs and piRNAs) from TE sequences;
for a general review see Hadjiargyrou 2013 4 ).

Figure 1. MiRNA biogenesis and mechanism of origination. (A) Synthesis of microRNAs begins in the nucleus when miRNA genes are transcribed via
Pol-II or Pol-III into a precursor (pri-miRNA) molecule that is several hundred nucleotides in length. Subsequent processing of this transcript by Drosha
results in a stem loop ~70 nt in length known as a "pre-miRNA". This RNA hairpin is then exported into the cytoplasm where it is trimmed by Dicer into
a functional, mature ~22 nt miRNA. (B) MiRNA mediated gene regulation typically requires base pairing between a specific region within the miRNA
(generally referred to as a "seed" comprising nucleotides 2 through 8) and a complementary "seed match" region in the mRNA. Base pairing in this figure
is indicated by bold vertical lines. The relevant regions of the miRNA and mRNA are shown in red. (C) As reported in this review, it is now known that the
molecular origins of many miR loci are a result of TE insertions into adjacent positions within the genome. The cartoon depicts a pri-miR transcript being
generated from transcription across such an area of converging TEs. The arrow indicates the direction of Pol-II transcription as it reads through a leading
strand LINE element into a neighboring negative strand containing the same TE. As shown, it is evident how such activity would result in the formation
of an RNA hairpin that could then be processed via the mechanism illustrated in (A). Figure adapted from reference 7.



Before examining how a miRNA locus can be initially 
formed from TE sequences, we must first detail the basic steps 
involved with miRNA expression. MiRNA biogenesis typi- 
cally begins with expression of an initial miRNA transcript 
that is several thousand nucleotides (nt) in length. 5 Next, these 
long RNA molecules are processed in the nucleus by Drosha 
to generate a ~70 nt stem loop known as a pre-miRNA that is
exported to the cytoplasm once excised. After arriving in the 
cytoplasm, pre-miRNAs enter the RNA interference pathway 
where DICER cleaves and denatures these stem loops, producing
the final, mature single stranded miRNAs, now ~20 nt in
size 6 (Fig. 1A). Once ready, these functionally mature miRNAs
typically engage in the regulation of gene expression by binding 
to complementary base pairs in the 3'UTRs of target mRNAs, 
typically resulting in gene silencing through either repressing 
translation or triggering mRNA degradation. 1 Interestingly 
while the mechanism of miRNA biogenesis has been fairly well 
described, the most integral facet of miRNA functionality, the 
specific mRNAs they target, continues to be elusive. Numerous 
groups have attempted to tackle this issue by devising various 
strategies for alignment-based search algorithms to identify 
targets, but these efforts have met with only limited success. 
To date, no generally accepted strategy for miRNA target pre- 
diction has been broadly embraced by the miRNA research 
community, primarily due to the inefficiency of current methods,
largely arising from the ability of miRNAs to bind target
mRNAs with only a few nucleotides of sequence complementarity 8
(Fig. 1B).
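The seed pairing described above (miRNA nucleotides 2 through 8 against a complementary site in the 3'UTR) can be illustrated with a minimal Python sketch. It is only a toy: the sequences and function names are hypothetical, and it requires a perfect seed match, whereas real miRNA targeting tolerates imperfect pairing, which is one reason alignment-based prediction remains difficult.

```python
from __future__ import annotations

# Illustrative sketch only: a perfect seed-match scan, not a published
# target-prediction algorithm. All names and sequences are hypothetical.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_positions(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions in `utr` that pair perfectly with the seed.

    The seed is taken as miRNA nucleotides 2 through 8 (1-based).
    Both sequences are written 5'->3'.
    """
    seed = mirna[1:8]                    # nucleotides 2..8
    site = reverse_complement(seed)      # sequence the mRNA must carry
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Arbitrary toy sequences chosen for illustration.
toy_mirna = "ACGGAUCUUAGCCAUGGAUCAA"
toy_utr = "GGCAAGAUCCGUUUCCAGAUCCGAA"

print(seed_match_positions(toy_mirna, toy_utr))  # [4, 16]
```

Because a 7 nt site is expected to occur by chance many times in a transcriptome, a scan of this kind returns many candidate sites that are not genuine targets.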

Importantly, several groups now suggest that a fundamen- 
tal insight into how miRNAs were originally formed (first 
provided in 2005 when Smalheiser and Torvik 9 hypothesized 
that miRNA hairpins were formed as a result of the insertion 
of two similar transposable elements into the same genomic 
locus) can be exploited to help elucidate miRNA function- 
ality. In their initial report, Smalheiser and Torvik showed 
that transcription across a juxtaposition of converging TEs 
followed by RNAi processing initially led to the formation 
of several functional miRNAs (Fig. 1C). While this relation- 
ship between miRNAs and TEs was largely underappreciated 
when it was first proposed almost a decade ago, the gen- 
eral model for initial miRNA genomic locus formation from 
TEs has since been corroborated by numerous indepen- 
dent reports and is now becoming generally accepted as the 
mechanism responsible for the formation of thousands of dis- 
tinct miRNAs in plants, animals and fungi. With that in mind 
this review will consist of a summary chronicle of the body of 
reports now independently supporting the initial formation of 
microRNAs from transposable element sequences (outlined in 
Fig. 2) followed by a discussion on the evolution of the field, 
current and future directions, and an in-depth examination
of the utility of this information in elucidating microRNA 
function. 



e29255-2 



Mobile Genetic Elements 



Volume 4 



Figure 2 (timeline entries):

2005: Smalheiser and Torvik were the first to describe a model for the molecular origin of miRNAs from TE sequences.
2006: Borchert et al. identified 46 additional human miRNAs formed from converging TEs.
2007: Piriyapongsa and Jordan found that Made1 TEs contain palindromic sequences which can form imperfect RNA hairpins when transcribed.
2007: Piriyapongsa et al. characterized the origins of 55 human miRNA loci from TEs.
2007: Yao et al. isolated 21 novel small RNAs from rice and found that they originated from MITEs via the same mechanism described by Piriyapongsa and Jordan.
2008: Piriyapongsa et al. described the TEs responsible for the initial formation of twelve Arabidopsis thaliana miRNAs.
2009: Devor et al. identified 14 marsupial specific miRNAs and found that half were formed from species specific TEs.
2010: Yuan et al. showed that all eight members of the placental-specific miRNA gene family miR-1302 were derived from MER53 TEs.
2011: Yuan et al. described the TE origins of 226 human miRNAs, 115 rhesus miRNAs, and 141 mouse miRNA genes.
2011: Borchert et al. comprehensively examined all of the ~15,000 miRNAs then annotated in miRBase and found roughly 15% of them had significant sequence homology to defined TEs.
2011: Li et al. analyzed seven plant species and found 106 miRNAs that aligned significantly with annotated plant TEs.
2012: Shao et al. demonstrated how TE derived piRNAs transition to form miRNAs during embryonic development.
2012: Tempel et al. analyzed six genomes and found that ~16% of the miRNAs had significant sequence homology to individual TEs.
2012: Vetukuri et al. categorized small RNAs in Phytophthora infestans based on size and found that the 21 nt class corresponding to miRNAs had significant homology to TEs.
2013: Ahn et al. were the first to experimentally validate the interactions between TE derived miRNAs and AGO proteins involved in RNAi.
2013: Roberts et al. characterized 1,213 additional miRNAs originating from TEs by examining the >7,000 novel miRNAs described since the initial analysis conducted by Borchert et al.
2014: Creasey et al. experimentally confirmed that thousands of TE transcripts are specifically targeted by more than 50 miRNAs in Arabidopsis thaliana.

Figure 2. Timeline illustrating published reports of microRNAs originating from TEs. Beginning in 2005 with the first description of the mechanism
by which adjacent TE insertions could result in miRNA formation, 23 papers spanning almost a decade are listed in chronological order including the
most recent publications as of this writing. Colors corresponding to the categories of research progress as used in this review are shown for clarity and
reference.



MiRNAs Found to Have Been Formed 
from TE Sequences: A chronology 

2005-2010: Initial reports 

In 2005 Smalheiser and Torvik 9 were the first to describe a 
model for the molecular origin of miRNAs from TE sequences. 
Their initial examination of human, mouse, and rat miRNA loci 
identified 11 miRNA hairpins readily shown to have been initially
formed from various repetitive sequences (LINEs, SINEs,
LTRs and simple repeats) via the model depicted in Figure 1C. In
addition, the authors went on to show that these miRNAs were 
also complementary to sequences in a large number of mRNAs 
that contained related TE sequences in their 3' UTRs, leading 
them to hypothesize that miRNA targets might also have arisen 
from TE sequences. Furthermore, in a related report the following
year, this same group demonstrated that highly conserved
Alu elements within the 3' UTRs of many human mRNAs bear
complementarity to 30 distinct human miRNAs, 10 providing
further evidence for the initial development of miRNA mediated 
regulatory networks based on TE-derived targets (as illustrated 
in Fig. 3). 

A more comprehensive study in 2006 by Borchert et al., 11 uti- 
lizing the same basic methodology as Smalheiser and Torvik, 9 
was able to identify 46 additional human miRNAs formed from 
converging TEs by including an examination of the sequences 
immediately flanking miRNA hairpins. In addition to
expanding the repertoire of miRNA formations from converging
TEs, this group also identified 43 additional miRNAs apparently
processed from hairpins found in the 3' tails of transcribed
Alu repeats clustered on human chromosome 19, suggesting the
existence of a second, unresolved mechanism for microRNA formation
from TE sequences.

In 2007, Piriyapongsa and Jordan 12 comprehensively investigated
the relationship between the seven members of the hsa-miR-548
family and the Made1 MITE (miniature inverted
repeat transposable element) sequence believed to be responsible
for their initial formation. Although the mechanism behind the
formation of the hsa-miR-548 loci was not directly addressed in
the report by Borchert et al., Piriyapongsa and Jordan recognized
that the origin of these miRNAs did not agree with Smalheiser's
model. 9 Instead, they found that Made1 TEs contain palindromic
sequences that form imperfect RNA hairpins when transcribed
and suggested that the initial creation of these miRNAs
occurred through a second, distinct mechanism of miRNA production
from TEs similar to the one Borchert et al. identified on
human chromosome 19. 11
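To make the idea of a palindromic sequence folding into an imperfect hairpin concrete, the following Python sketch folds a transcript back on itself around a short loop and counts how many positions can base pair. It is a crude stand-in for real RNA folding software, and the function name, the loop length and the toy sequences are all assumptions made for illustration; none of them is an actual MITE sequence.

```python
from __future__ import annotations

# Watson-Crick pairs plus the G:U wobble pair commonly tolerated in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairing(rna: str, loop_len: int = 4) -> tuple[int, int]:
    """Return (paired, stem_length) for a simple fold-back of `rna`.

    The i-th base from the 5' end is compared with the i-th base from the
    3' end; the central `loop_len` bases are left unpaired. No bulges or
    alternative structures are considered.
    """
    stem_len = (len(rna) - loop_len) // 2
    paired = sum((rna[i], rna[-1 - i]) in PAIRS for i in range(stem_len))
    return paired, stem_len


# Hypothetical toy transcripts: 10 nt arm + 4 nt loop + 10 nt arm.
perfect = "GGAUCCAUGC" + "GAAA" + "GCAUGGAUCC"    # arms are exact reverse complements
imperfect = "GGAUCCAUGC" + "GAAA" + "GCAUGCAUCC"  # one mismatch in the stem

print(stem_pairing(perfect))    # (10, 10)
print(stem_pairing(imperfect))  # (9, 10)
```

An imperfect stem of this kind, with most but not all positions paired, is what the text means by an imperfect RNA hairpin.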

In a second report in 2007, 13 through utilizing an alterna- 
tive computational strategy, Piriyapongsa et al. characterized 
the origins of 55 human miRNA loci from TEs, independently 
corroborating the findings of Smalheiser 9 and Borchert 11 as well 
as initially describing the origins of nine previously undescribed 
human miRNA loci from TEs. Moreover, as the genomes of
distinct taxa are largely populated by unique TE compositions 14,15
and the authors identified several miRNAs formed
from TEs unique to particular taxa, their findings led them to 
suggest that TEs are an unappreciated source of taxon-specific 
microRNAs. In addition, the group also went on to predict 85 
novel TE-derived miRNA genes based on the conservation of 
hairpin forming potential within various transposable elements. 
Strikingly, they found 15 of these perfectly aligned with experi- 
mentally cloned miRNAs, indicating the utility of including TE 
sequences in searches for putative miRNAs. 

Also in 2007, providing the first evidence that similar mechanisms
for miRNA locus formation are responsible for the creation
of miRNAs in plants, Yao et al. 16 isolated 21 novel small RNAs
from rice. Interestingly, the hairpin precursors corresponding
to these small RNAs were found to have originated from MITEs,
apparently via the same mechanism Piriyapongsa and Jordan 12
described as being responsible for the formation of the
miR-548 family from MITE sequences in humans. Although the
authors conjectured that the majority of these small RNAs
represented an intermediate between siRNAs and miRNAs, several
of these small RNAs have now been annotated as belonging
to the MITE-derived osa-miR-2121 family. 17

In 2008, Piriyapongsa et al. 18 next undertook a comprehensive
examination of the genomic loci corresponding to the then
characterized Oryza sativa and Arabidopsis thaliana miRNAs,
describing the TEs responsible for the initial formation of 12 of
the Arabidopsis miRNAs (6.5%) and 83 of the rice miRNAs (35.9%).
Of note, in contrast to animal miRNAs, whose sequences are
generally found to diverge from their progenitor elements over
time and consequently are not perfectly maintained, ten of the 12
TE-derived miRNAs in Arabidopsis and 38 of the 83 TE-derived
miRNAs in rice were found to be 100% identical to consensus
TEs. In addition, in this work Piriyapongsa et al. also identified
several examples of individual plant TEs encoding both
siRNAs and miRNAs, leading them to hypothesize (much like
Yao et al. 16 ) that many miRNAs may have originally functioned
as siRNAs, and that this siRNA-to-miRNA transition might
illustrate how siRNAs initially charged with silencing TEs could
later be exploited for additional levels of gene regulation.

Figure 3. Cartoon depicting the cellular events responsible for the formation of many miRNAs as well as the network of genes they regulate. As
described in Figure 1, random TE insertions into the genome at neighboring positions can lead to the formation of miRNAs. During the extensive
period of time it would take for this event to occur the same TE also likely inserted into noncoding regions of protein coding transcripts
elsewhere in the genome. As illustrated, this series of events can result in the formation of a network of genes capable of
regulation by the TE-derived miRNA. Figure adapted from reference 7.

In 2009, Devor et al. 19 performed a comprehensive charac- 
terization of the miRNAs they experimentally identified in 
Monodelphis domestica, including an examination of the cor- 
relation between these molecules and TEs. In addition to the 
174 miRNAs they found in M. domestica that were conserved
across mammals, the group also identified 14 miRNAs that were 
unique to marsupials, including three exclusive to M. domestica. 
Strikingly, when mapped against repeat data sets they found that 
seven (half of the lineage-specific microRNAs) aligned with TEs 
(six with L1/L2 LINEs and one with a Mariner DNA transpo- 
son), and that these repeat elements were themselves marsupial 
specific. Further analysis revealed that two of three M. domestica 
specific miRNAs originated from within a large genomic clus- 
ter of 39 miRNAs and that this entire region was flanked on 
both ends by a marsupial specific L1 element, leading the team
to believe that this juxtaposition was responsible for the forma- 
tion of the entire cluster of miRNAs via the mechanism first 
proposed by Smalheiser et al. 9 Importantly, by demonstrating 
that species-specific miRNAs arose from species-specific TEs, 
the authors' report further strengthened the argument originally 
suggested by Piriyapongsa et al., 12 namely that taxon-specific TEs frequently
give rise to taxon-specific miRNAs uniquely attuned to
the organism, as well as further supporting the broader idea that 
TEs are routinely involved in the emergence of new miRNAs. 

In 2010 Yuan et al. 20 investigated the origin and evolution 
of the placental-specific miRNA gene family miR-1302 and 
showed that all eight members of this family were derived from 
MER53 TEs. The group also identified 36 potential paralogs of 
this miRNA in the human genome and another 58 orthologs 
conserved across placental mammals and suggested that all of 
these were similarly formed from MER53 TEs. 

2011: The year of comprehensive computational analyses 

In 2011, Yuan et al. 21 published an analysis of how mam- 
malian miRNA families originally formed from TEs uniquely 
expanded within three genomes (human, rhesus, and mouse). 
Using a novel strategy, the authors looked at the coverage den- 
sity of TEs within these genomes and determined if individual 
miRNA genes originated from particular TEs based on sequence 
homology. By employing this computational methodology the 
group successfully described the TE origins of 226 human miR- 
NAs, 115 rhesus miRNAs, and 141 mouse miRNA genes from 
various LINEs, SINEs, LTRs and DNA transposons. 

Next in 2011, Borchert et al. 22 performed the first ever com- 
prehensive analysis of the genomic events responsible for miRNA 
formation (examining all of the ~15,000 miRNAs annotated in
miRBase 17,23 at the time) by employing a computational methodology
developed to align miRNA sequences to the principal
data sets for TEs and noncoding RNAs (ncRNAs). In all, the 
authors found that roughly 15% of analyzed miRNAs had significant
sequence homology to defined TEs. The authors proposed
that the majority (~89%) of these miRNAs originated
via the model depicted in Figure 1C, with the remaining
~11% instead corresponding to characteristic hairpin-forming
sequences within individual TEs (e.g., MITE sequences previously
suggested to produce miRNAs 12 ). Of the 2,392 miRNAs
Borchert et al. found to have TE origins, DNA transposons were
most frequently responsible for miRNA generation (891). The
rest originated from LTR retrotransposons (414), non-LTR
retrotransposons (814, a total that appears to include the 312
LINEs and 353 SINEs), satellites (137)
and others (136). Interestingly, sequences contained within the
"other" category had significant sequence identity to known 
noncoding RNA sequences such as snoRNAs and tRNAs, each 
of which have been speculated to have contributed to the forma- 
tion of novel mobile genetic elements. 24,25 Based on their findings 
the authors further advanced the hypothetical proposition that 
miRNA based regulatory systems first arose and subsequently 
persisted over time as a result of the obvious advantage conveyed 
by the ability to regulate host genes containing portions of the 
TE from which a miRNA was formed (Fig. 3). Since these gene 
networks have been found in organisms as primitive as protozoa, 
the authors went on to speculate that miRNA based regulation 
of multiple genes may have been the catalyst in the evolution of 
more sophisticated developmental systems. 
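The class counts quoted above sum to the stated 2,392 only if the LINE (312) and SINE (353) figures are read as subsets of the non-LTR retrotransposon total (814). That reading is an assumption inferred from the arithmetic, not something the text states outright. A short Python check of the tally:

```python
# Counts as quoted in the text for the 2,392 TE-derived miRNAs.
top_level = {
    "DNA transposon": 891,
    "LTR retrotransposon": 414,
    "non-LTR retrotransposon": 814,
    "satellite": 137,
    "other": 136,
}

# Assumption: LINEs and SINEs are subsets of the non-LTR retrotransposon total.
non_ltr_subsets = {"LINE": 312, "SINE": 353}

total = sum(top_level.values())
unassigned_non_ltr = top_level["non-LTR retrotransposon"] - sum(non_ltr_subsets.values())

print(total)               # 2392
print(unassigned_non_ltr)  # 149
```

Under this reading, 149 of the non-LTR-derived miRNAs fall outside the LINE and SINE subtotals.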

Also in 2011, Li et al. 26 found that a substantial number of
previously described, experimentally isolated plant miRNAs
were homologous to TEs as well as to TEs contained within
mRNA transcripts. In all, in an examination of seven plant species
the authors found 106 miRNAs aligning significantly with
annotated plant TEs, and similar to the report by Zhang et al., 27
the authors found that ~80% of these TE-derived miRNAs were
apparently initially formed from MITE sequences.

Lastly in 2011, Zhang et al. 27 examined the origin and evo- 
lution of miRNA loci in flowering plants. After conducting 
genome wide analyses of Oryza sativa and Arabidopsis thaliana 
they described four potential molecular mechanisms responsible 
for the formation of miRNAs — two of which involve TEs. In 
particular, in agreement with the earlier work of Yao et al., 16 the 
authors found that many of the miRNA genes in rice had notable 
sequence overlap with MITEs. In all, they found 85 of the 290 
characterized rice miRNA genes were apparently formed from 
TEs, with over half of these corresponding to MITEs. Also, in 
agreement with previous work by Smalheiser 9 and Borchert, 11 
preliminary target predictions for these miRNAs suggested that 
many of the TE-derived miRNAs (45 of 85) characterized in this 
work had target sites bearing complete homology with the same
TE that gave rise to the individual miRNA. This led the authors to
propose that the transposition of this TE into other genes could 
yield other target sites for the miRNA to interact with and sub- 
sequently allow for the creation of regulatory systems that could 
enhance the evolutionary capacity of the organism (as illustrated 
in Fig. 3). 
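The reasoning in this paragraph, that a miRNA born from a TE may find target sites wherever copies of the same TE have inserted into other genes, can be sketched as a simple lookup. The Python sketch below is an illustration under stated assumptions rather than any group's published pipeline: the miRNA names, gene names, TE family labels and the `candidate_targets` helper are all hypothetical.

```python
from __future__ import annotations


def candidate_targets(
    mirna_origin: dict[str, str],
    utr_tes: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each miRNA to the genes whose 3'UTR carries its progenitor TE family.

    `mirna_origin` maps a miRNA name to the TE family it was formed from.
    `utr_tes` maps a gene name to the TE families annotated in its 3'UTR.
    """
    return {
        mirna: sorted(gene for gene, tes in utr_tes.items() if te in tes)
        for mirna, te in mirna_origin.items()
    }


# Hypothetical annotations, made up for illustration.
toy_origin = {"miR-toy1": "MITE_X", "miR-toy2": "LINE_Y"}
toy_utrs = {
    "geneA": {"MITE_X"},
    "geneB": {"MITE_X", "SINE_Z"},
    "geneC": {"SINE_Z"},
    "geneD": set(),
}

shortlist = candidate_targets(toy_origin, toy_utrs)
print(shortlist)  # {'miR-toy1': ['geneA', 'geneB'], 'miR-toy2': []}
```

A shortlist like this would still need sequence-level confirmation of the target site; the lookup only narrows the search space to transcripts sharing the progenitor TE.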

2012: Enter next generation sequencing 

In 2012 Shao et al. 28 employed next generation sequencing 
(NGS) to categorize small RNAs found in chickens and dem- 
onstrated how TE derived piRNAs transition to form miRNAs 
during embryonic development. They examined the expression 
levels of miRNAs throughout development and found that most
were dynamically regulated throughout formation with many of
them targeting signal transduction pathways related to reproduc- 
tion and embryogenesis. It was also shown that the TEs giving 
rise to the initial piRNAs were the most abundant ones in the 
genome. 

Also in 2012, Cai et al. 29 found numerous TE derived small 
RNAs (182 miRNAs) in the Bombyx mori (silkworm) genome. 
In the first study of its kind in the silkworm, the authors system- 
atically discovered TE-associated small RNAs transcribed from 
the B. mori genome through employing a deep RNA-sequencing 
strategy yielding 182, 788 and 4,990 TE-associated small RNAs 
corresponding to miRNAs, siRNAs and piRNAs, respectively. 

Next in 2012, Tempel et al. 30 undertook an extensive study to 
map all miRNA precursor sequences in miRBase 17,23 to several 
genomes in order to determine if any had overlapping sequences 
with TEs. They used an automated method called ncRNA clas- 
sifier in order to catalog the interaction between the TEs and 
pre-ncRNAs. By analyzing six genomes (frog, human, mouse, 
nematode, rat and sea squirt) the group found that ~16% of the
miRNAs (strikingly similar to the ~15% reported by Borchert
et al. 22 in 2011) had significant sequence homology to individual
MITEs, DNA transposons, LTR/ERV, CR1/RTE, L1s, SINEs,
and other non-LTRs.
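Mapping precursor sequences to a genome and asking whether they overlap annotated TEs reduces, at its simplest, to an interval-overlap test. The Python sketch below shows that step only. It is not the ncRNA classifier tool itself: the `Interval` type, the coordinates, the names and the 50% coverage threshold are assumptions chosen for illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    name: str


def covered_fraction(query: Interval, other: Interval) -> float:
    """Fraction of `query` covered by `other` (0.0 on different chromosomes)."""
    if query.chrom != other.chrom:
        return 0.0
    shared = min(query.end, other.end) - max(query.start, other.start)
    return max(shared, 0) / (query.end - query.start)


def te_overlaps(
    precursors: list[Interval],
    tes: list[Interval],
    min_fraction: float = 0.5,   # assumed threshold, not taken from the source
) -> dict[str, list[str]]:
    """List, for each precursor, the TEs covering at least `min_fraction` of it."""
    return {
        p.name: [t.name for t in tes if covered_fraction(p, t) >= min_fraction]
        for p in precursors
    }


# Hypothetical coordinates, made up for illustration.
toy_precursors = [
    Interval("chr1", 100, 180, "pre-miR-toyA"),
    Interval("chr1", 500, 570, "pre-miR-toyB"),
    Interval("chr2", 100, 180, "pre-miR-toyC"),
]
toy_tes = [
    Interval("chr1", 50, 150, "TE_1"),
    Interval("chr1", 140, 300, "TE_2"),
    Interval("chr1", 560, 900, "TE_3"),
]

hits = te_overlaps(toy_precursors, toy_tes)
print(hits)
```

For genome-scale annotation sets, a sorted sweep or an interval tree would replace the nested comparison used here.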

Also in 2012, Nosaka et al. 31 showed that miRNAs formed 
from a class of TEs in rice are involved in the suppression of host- 
mediated TE silencing. In plants TEs are typically suppressed 
epigenetically via small RNA directed DNA methylation, but 
in this instance the authors showed that miR-820 family mem- 
bers originating from CACTA DNA transposons were actually 
targeting and repressing one of the methyltransferase genes, 
OsDRM2, responsible for epigenetic suppression. This pur- 
ported ability of the TE to utilize miRNAs as a countermeasure 
to host silencing and subsequently allow further TE insertion 
into the genome elucidates a novel function of TE derived miR- 
NAs and provides insight into the possible evolutionary forces 
driving the relationship between TEs and miRNAs. 

Lastly in 2012, Vetukuri et al. 32 examined data obtained 
from small RNA mediated RNAi processes in
Phytophthora infestans, the oomycete pathogen responsible for
late blight in Solanaceae. They categorized the small RNAs based
on size (21 nt, 25/26 nt, and 32 nt) and found that the majority
were homologous to LTR retrotransposons within the P. infestans
genome. Notably, the 21 nt class of small RNAs corresponding
to miRNAs showed the most homology to the transposable ele- 
ments. Interestingly, the group also identified six putative miR- 
NAs with characteristics of both plant and metazoan miRNAs. 

2013: The year of experimental validation for TE-derived 
miRNAs in RNAi 

In 2013 Ahn et al. 33 examined miRNAs specifically formed 
from palindromic MER (Medium Reiteration frequency) TEs in 
the genomes of primates, rodents, and rabbits. After identifying 
three specific miRNAs derived from MER96, they next experimentally
validated the interactions between these MER-derived
miRNAs and the catalytic AGO1, AGO2, and AGO3 proteins
involved in the RNAi gene silencing pathway. Importantly, this
work constituted the first ever definitive demonstration that
miRNAs derived from TEs can be processed via the same RNAi
mechanism as other non-TE derived miRNAs.

Also in 2013, driven by the contested authenticity of many of 
the annotated miRNAs in rice due to sequence homology with 
TEs, Ou-Yang et al. 34 examined the association between miR- 
NAs originating from TEs and the RISC (RNA-induced silenc- 
ing complex) proteins responsible for miRNA function. To this 
end, the group characterized seven miRNAs substantially corresponding
to MITE sequences that were complexed with
AGO1 in immunoprecipitation assays, further substantiating
that TE-derived miRNAs are in fact involved with RISC regula- 
tions in the same way as other silencing ncRNAs. 

Next in 2013, Spengler et al. 35 experimentally validated that 
miRNAs and miRNA target sites derived from common human 
TEs such as LINE2, MIR, and Alus are functional. Importantly, 
the authors experimentally demonstrated that TEs embedded 
within the 3' UTR of genes can serve as the source of target 
sites for many miRNAs. Specifically, 3' UTR embedded Alus 
were shown to be the origin of target sites for miR-24 and miR- 
122, and an Alu-derived microRNA, miR-1285-1, was shown 
to regulate genes containing target sites with homologous Alu 
elements. 

Finally in 2013, building on the previous comprehensive 
analysis by Borchert et al., 22 Roberts et al. 7 characterized 1,213 
additional miRNA origins from TE elements by examining 
the > 7,000 novel miRNAs described since the earlier analysis, 
bringing the total number of miRNA loci origins defined by 
this group to 3,605. In all, these studies have comprehensively 
defined the origins of roughly 15% of the miRNAs currently 
annotated in miRBase as being formed from TE sequences. 

2014: Broader acceptance? 

As recently as early 2014 Yu et al. 36 showed that the spring 
wheat miRNA TamiR-1123 originated from a family of MITEs. 
A gene involved in the vernalization of spring wheat, Vrn-A1a,
contains a MITE in its promoter that is able to be transcribed
into a stable RNA hairpin. This MITE also contains sequences
that are homologous to TamiR-1123, and both the MITE
derived RNA hairpin and TamiR-1123 were detected when a gel
containing small RNAs was probed with labeled TamiR-1123.
Based on this evidence, the authors concluded that TamiR-1123
originated from the MITE, and that Vrn-A1a is potentially regulated
by this TE-derived microRNA.

Most recently, in an extremely exciting report in a 2014 
Letter to Nature, Creasey et al. 37 used parallel analysis of RNA 
ends (PARE) sequencing to experimentally demonstrate that 
thousands of transposon transcripts are specifically targeted by 
more than 50 miRNAs for cleavage and processing by RNA- 
dependent RNA polymerase 6 (RDR6) in Arabidopsis thaliana. 

Discussion 

Taken together the reports chronicled here clearly indicate 
that the relationship between miRNAs and TEs is much more 
significant than what was originally believed when the model 
was first proposed in 2005. Specifically, there is now abundant
evidence indicating that transposable elements are directly
involved in the initial formation of an appreciable number of 
miRNA loci, a quantity likely much higher than currently 
defined due to the progressive degeneration of TEs and the lim- 
ited availability of complete miRNA and TE data sets. After 
fully reviewing the body of these works, we suggest that a thor- 
ough understanding of the connections between specific TEs 
and microRNAs can be directly employed to: (Utility I) ascer- 
tain insights into miRNA transcriptional regulation, and also to 
(Utility II) facilitate accurate miRNA target prediction. 
Utility I 

Knowing the TEs giving rise to a miRNA can provide a 
deeper understanding into the elements controlling their expres- 
sion. This was first demonstrated in 2006 11 when 43 human 
miRNAs located immediately downstream of Alu transposable 
elements were shown to be transcribed by RNA polymerase 
III (Pol-III) as the 3' ends of distinct Alu transcripts. Amidst
analyzing the set of all known human miRNA loci for TE rela- 
tionships, this study described a large cluster of primate-specific 
miRNAs located on chromosome 19 (C19MC) 38 in which individual
miRNAs were consistently flanked upstream (~100 bp) by
intact Alu repeats. Although the accepted paradigm prior to this 
report was that miRNAs were exclusively transcribed by RNA 
polymerase II (Pol-II), this group elected to examine whether 
these miRNAs were actually being transcribed by Pol-III as the 
3' tails of Alu repeats as Alu elements were well documented as 
being transcribed by Pol-III. These researchers found that the 
Alus located upstream of the miRNAs in the C19MC contained 
the sequences necessary for Pol-III expression. 39,40 Significantly, 
in each of their experimental assays (cell free transcription, 
expression constructs, and chromatin immunoprecipitation), 
the authors found that Pol-III was associated with the C19MC 
miRNA promoters and responsible for the expression of these
miRNAs, directly contradicting the assumption that miRNAs
were exclusively transcribed via Pol-II. Importantly, through
identifying the particular TEs responsible for the formation of 
the miRNA genomic loci in the C19MC, these authors obtained 
direct insights into the mechanisms behind the transcriptional 
regulation of these miRNAs, clearly illustrating the value of 
determining the TEs responsible for forming individual miRNA 
loci in ascertaining their transcriptional regulation. 

Utility II 

Beyond the transcriptional insights provided by knowing 
the genomic origins of individual miRNAs, perhaps the great- 
est potential utility of this information is in facilitating accurate 
miRNA target prediction. Several groups have now suggested 
that a subset of miRNAs may preferentially target TE sequences 
located in mRNA 3'UTRs. 10,12,35,41 Based on this, Filstein 
et al. 41 recently developed a novel approach to identifying
miRNA targets, proposing that a miRNA and its related mRNA
target site might be created concurrently by the continuing
mobilization of a common ancestral TE
(Fig. 3). In their initial report this group suggested that the accu- 
racy of target identification algorithms could be significantly 
improved by limiting searches to mRNAs that contain the TEs 
identified as being responsible for the formation of individual
miRNAs. After developing a computational methodology based
on this, OrBId (Origin Based Identification), the authors gener- 
ated putative mRNA target sets for 191 human miRNAs with 
defined TE origins. While the authors found their methodology 
was best suited for predicting targets of taxon-specific miRNA 
loci formed more recently in evolutionary time, they also found 
this strategy was capable of successfully predicting targets for 
the evolutionarily older, mammalian-conserved miR-28 fam- 
ily — targets found to be largely in agreement with both conven- 
tional target prediction algorithms 42,43 and existing experimental 
evidence. 44 While further validation and a more comprehensive 
comparison of OrBId target sets to those generated by existing 
methods will be required to further substantiate this innovative 
target prediction strategy, this work clearly advances the idea 
that the mRNA targets of miRNAs with defined TE origins can 
successfully be predicted based on a common TE origin shared 
by a miRNA and target site. Importantly, this point was recently
corroborated in a report by Spengler et al. 35 (discussed in
the preceding chronology), which demonstrated experimentally
that common human TEs such as Alu elements contained within
the 3'UTRs of active genes function endogenously as miRNA
target sites.
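
The origin-based idea described above can be illustrated with a short sketch. This is not the OrBId implementation, whose details are not given here; it is a minimal example that assumes (i) each 3'UTR comes with a set of annotated TE families and (ii) a candidate site is a perfect match to miRNA nucleotides 2-8. The names Utr, seed_site and origin_filtered_targets, along with all sequences and gene labels, are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Utr:
    gene: str                # hypothetical gene label
    sequence: str            # 3'UTR as RNA, written 5'->3'
    te_families: frozenset   # TE families annotated within this UTR

_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna: str) -> str:
    """Return the mRNA site (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return seed.translate(_COMPLEMENT)[::-1]

def origin_filtered_targets(mirna: str, origin_te: str, utrs) -> list:
    """Keep only UTRs that carry the miRNA's TE of origin AND a seed match.

    The TE filter is the origin-based restriction; the seed match is a
    stand-in for whatever conventional scoring would follow.
    """
    site = seed_site(mirna)
    return [u.gene for u in utrs
            if origin_te in u.te_families and site in u.sequence]

# Toy data (made up for illustration only).
toy_mirna = "AUGGCAUCCGAUUAGCUAGCAA"
toy_utrs = [
    Utr("GENE_A", "CCCGAUGCCAUUU", frozenset({"Alu"})),  # site + Alu
    Utr("GENE_B", "CCCGAUGCCAUUU", frozenset({"L1"})),   # site, wrong TE
    Utr("GENE_C", "CCCCCCCCCCCCC", frozenset({"Alu"})),  # Alu, no site
]
print(origin_filtered_targets(toy_mirna, "Alu", toy_utrs))
```

In this toy set only GENE_A passes both filters, which shows how restricting the search space by shared TE origin would shrink the candidate list before any conventional scoring is applied.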

In terms of future directions of the field as a whole, since all 
the conclusions drawn in this review are limited by the currently 
available data, further advances in our understanding of the rela- 
tionships between microRNAs and TEs will depend upon and 
advance along with novel, ongoing miRNA and TE discovery 
and annotation. Furthermore, we suggest that only a fraction of 
characterizable TE::miRNA relationships have been defined to
date. First, not all microRNAs and TEs have been identified. 
RepBase, the most commonly used repetitive element database, 
and its microRNA equivalent database — miRBase — are con- 
stantly being updated and therefore new elements for examina- 
tion are provided with every update to these resources. 15,40,45 

Novel, ongoing miRNA 

Additionally, in terms of miRNA discovery, currently anno- 
tated miRNAs are biased toward evolutionarily older and non- 
repetitive sequences, as it was typical of many of the initial 
miRNA cloning efforts to discard all sequences that were
homologous to transposable elements. 38,46 Furthermore, early
miRNA cloning studies also commonly discarded pools of "tRNA
degradation products" 38,46 (e.g., short RNAs corresponding to
tRNAs and snoRNAs) that were ~20 nt in size. Today, however,
next generation sequencing (NGS) of RNA populations immu- 
noprecipitating with miRNA protein complexes will likely help 
redefine many of these previously omitted short RNAs as func- 
tional miRNAs. Clearly, recent reports dealing with TE-derived 
miRNA endogenous functions such as the one by Spengler et al. 35 
discussed above, as well as recent studies identifying tRNA and 
snoRNA 26,43 sequence fragments complexed with RNAi machin- 
ery, suggest that these previously overlooked short noncoding 
RNAs are actually being processed by the RISC machinery and 
participate in miRNA-like regulations. As many SINEs were 
initially formed from tRNA sequences, 26,43 and snoRNAs have 
also been shown to propagate through genomes and increase in 
number by retrotransposition, 26,43 it will be interesting to see if 
these tRNA- and snoRNA-derived miRNA-like elements were
formed through similar mechanisms and behave similarly to 
other TE-derived miRNAs or if they instead constitute a distinct 
set of small RNAs with unique properties.

TE discovery and annotation

In terms of TE discovery, identification of the elements 
responsible for miRNA origins (as well as their mRNA targets 
that arose from TE insertion into expressed sequences) becomes 
increasingly less accurate the farther back in evolutionary time 
an insertion event occurred. Given that miRNAs have been well
characterized as regulators in situations of both nutrient
deprivation and abiotic stress, 4,11,35,38,41 we speculate that
the cryptic beginnings of some of the more archaic miRNA loci,
although harder to describe, might also be explained via TE
origins. Within
this model, selective pressure to preserve only the sequences 
required for stem loop structure, target recognition, and tran- 
scriptional regulation could account for the degradation of other 
noncritical components, and thus result in increased difficulty 
when identifying TE-derived miRNAs and their associated TE 
containing mRNA targets. More simply, genomes are highly 
plastic, and constituent sequences that provide no significant 
advantage are eventually degenerated (e.g., if only a 30 bp por- 
tion of a 7 kb LINE insertion conveyed a meaningful benefit, 
the 30 bp segment would be maintained while the remainder of 
the 7 kb would ultimately degrade over time). Fortunately new 
algorithms are being developed that focus on identifying TEs 
that have degenerated beyond the ability of RepeatMasker 15 to
identify them. One such program, Greedier, 47 has recently been 
used to facilitate the discovery of TEs across eukaryotes by tak- 
ing into account the fragmentation of repeats. This method is 
proving particularly useful at identifying degenerate TEs within 
genomes that would escape identification by RepeatMasker 
alone. Programs such as this, or others employing alternate 
strategies for characterizing degenerate TE sequences, may well 
be utilized in future analyses to successfully characterize more 
ancient miRNA:TE relationships.

Beyond the relationship between miRNAs and TEs continu- 
ing to advance through novel miRNA and TE discovery, the 
acceptance of this relationship by the broader miRNA
community is also of importance. Notably, miRNAs are not the
only TE-derived small RNAs known to participate in RNAi. In
fact, although not directly discussed in this review, it is
widely accepted that both of the other principal classes of
short RNAs engaged in RNAi (endogenous
siRNAs and piRNAs) are also processed from and correspond 
to transposable elements (recently reviewed by Hadjiargyrou 4 ). 
In contrast, despite several groups having now independently 
published reports (as summarized here) clearly describing the 
genomic origins of thousands of individual microRNAs from 
an array of transposable element sequences, it is still common
practice to discard small RNA sequence reads that readily align
to TEs when attempting to experimentally identify novel miR- 
NAs. 17,23 We suggest this represents a significant error in the con- 
ventional microRNA discovery pipeline that should be corrected 
by those continuing to eliminate sequences corresponding to 
TEs from consideration as miRNAs. 
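
As a minimal sketch of the alternative suggested above, a discovery pipeline could annotate TE-overlapping reads rather than drop them. The example below assumes that read alignments and repeat annotations are already available as half-open genomic intervals; the names Interval, overlaps and annotate_reads and all coordinates are hypothetical and do not correspond to any particular published pipeline.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str    # read ID or repeat family label

def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals on the same chromosome intersect."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end

def annotate_reads(reads, repeats):
    """Pair every read with the repeats it overlaps; no read is discarded.

    This is a simple O(reads x repeats) scan, adequate for illustration;
    a real data set would call for an indexed interval structure.
    """
    return [(read.name, [te.name for te in repeats if overlaps(read, te)])
            for read in reads]

# Toy coordinates (made up for illustration only).
toy_reads = [
    Interval("chr19", 100, 122, "read1"),
    Interval("chr19", 500, 521, "read2"),
]
toy_repeats = [Interval("chr19", 90, 110, "AluSx")]

for read_name, te_hits in annotate_reads(toy_reads, toy_repeats):
    print(read_name, te_hits or "no TE overlap")
```

Keeping the TE label alongside each read leaves the candidate available for downstream miRNA criteria, and the label itself becomes a record of the putative TE origin.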

Furthermore, beyond the functional implications of
characterizing miRNA:TE relationships, these connections may well
also play underappreciated roles in speciation. As an example, 
Filstein et al. 41 found the genomic origins of more recently 
established, taxon-specific miRNA-loci to be readily definable, 
characterizing 113 human miRNA genomic loci as having been 
formed from primate-specific Alu TEs. As they found Alu inser- 
tions into expressed regions of the human genome were respon- 
sible for the formation of numerous human miRNA loci, 48 the 
authors speculated that the continued expansion of Alus within 
the genome does not constitute a failure of the RNAi pathway 
to inhibit Alu transposition, but instead represents a beneficial 
genetic partnership whereby the insertion of Alu elements into 
noncoding sections of transcripts has culminated in minor alter- 
ations of gene expression that have ultimately led to a heightened 
rate of adaptation for the human genome that could potentially 
be in part responsible for our unique complexities in comparison 
to other primates. 

In conclusion, while the events leading to the initial forma- 
tion of many miRNA loci may never be fully characterized, 
the cumulative efforts of the reports summarized in this review 
unequivocally demonstrate that a notable percentage of miRNAs 
were initially formed from TE sequences, with many more likely 
to be identified as new miRNAs and TEs are elucidated. As these 
additional relationships continue to be defined, it will be
interesting to see how much of an impact the development and
evolution of programs like OrBId will have on our ability to
identify endogenous miRNA targets, as well as how far
experimental demonstrations that TE-derived microRNAs do
participate in endogenous regulation (like those published in
2013 and 2014 by Ahn et al., 33 Ou-Yang et al., 34 Spengler et al. 35
and Creasey et al. 37 ) will lead the broader miRNA community
to further embrace the relevance of these microRNAs and their
relationship to transposable elements.